Syntactic-Based Methods for Measuring Word Similarity

Authors

Pablo Gamallo

Caroline Gasperin

Alexandre Agustini

José Gabriel Pereira Lopes

Abstract

This paper explores different strategies for extracting similarity relations between words from partially parsed text corpora. The strategies we have analysed require neither supervised training nor semantic information from general lexical resources. They differ in the amount and the quality of the syntactic contexts against which words are compared. The paper presents in detail the notion of syntactic context and how syntactic information can be used to extract semantic regularities of word sequences. Finally, experimental tests on a Portuguese corpus demonstrate that similarity measures based on fine-grained and elaborate syntactic contexts perform better than those based on poorly defined contexts.
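The general idea of comparing words by the syntactic contexts they share can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not specify the paper's similarity measure or context definition, so the Jaccard coefficient, the triple format `(head, relation, dependent)`, and the names `context_sets` and `jaccard` are all hypothetical choices made for this example, and the toy triples are invented.

```python
from collections import defaultdict


def context_sets(triples):
    """Map each word to the set of syntactic contexts it occurs in.

    Each triple is (head, relation, dependent). The dependent gets the
    context (relation, "head", head); the head gets the mirrored context
    (relation, "dep", dependent). This encoding is an assumption made
    for illustration, not the paper's definition.
    """
    contexts = defaultdict(set)
    for head, relation, dependent in triples:
        contexts[dependent].add((relation, "head", head))
        contexts[head].add((relation, "dep", dependent))
    return contexts


def jaccard(a, b):
    """Jaccard coefficient of two context sets (0.0 if both are empty)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


# Invented toy dependency triples, standing in for a partially parsed corpus.
triples = [
    ("drink", "obj", "wine"),
    ("drink", "obj", "beer"),
    ("bottle", "of", "wine"),
    ("bottle", "of", "beer"),
    ("drive", "obj", "car"),
]

contexts = context_sets(triples)
sim_wine_beer = jaccard(contexts["wine"], contexts["beer"])
sim_wine_car = jaccard(contexts["wine"], contexts["car"])
print(sim_wine_beer, sim_wine_car)
```

In this toy data, "wine" and "beer" occur in exactly the same contexts, while "wine" and "car" share none. A coarser context definition (for example, dropping the relation label) would make more words look alike, which is the kind of contrast between fine-grained and poorly defined contexts that the abstract refers to.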

Similar resources

Sentence Similarity Measuring by Vector Space Model

In work on natural language processing and text mining, measuring sentence similarity is an important task. Three major approaches can be followed: one measures similarity based on the semantic structure of sentences, while the others are based on syntactic similarity measures and hybrid...


Use of Common-Word Order Syntactic Similarity Metric for Evaluating Syllabus Coverage of a Question Paper

Syllabuses are used to ensure consistency between educational institutions. A modularized syllabus contains weightages assigned to different units of a subject. Different criteria like Bloom’s taxonomy, learning outcomes etc., have been used for evaluating the syllabus coverage of a question paper. But we have not come across any work that focuses on syntactic text similarity evaluation of unit...


CFILT-CORE: Semantic Textual Similarity using Universal Networking Language

This paper describes the system that was submitted in the *SEM 2013 Semantic Textual Similarity shared task. The task aims to find the similarity score between a pair of sentences. We describe a Universal Networking Language (UNL) based semantic extraction system for measuring the semantic similarity. Our approach combines syntactic and word level similarity measures along with the UNL based se...


SEMILAR: A Semantic Similarity Toolkit for Assessing Students' Natural Language Inputs

We present in this demo SEMILAR, a SEMantic similarity toolkit. SEMILAR offers in one software environment several broad categories of semantic similarity methods: vectorial methods including Latent Semantic Analysis, probabilistic methods such as Latent Dirichlet Allocation, greedy lexical matching methods, optimal lexico-syntactic matching methods based on word-to-word similarities a...


Combining Word Embedding and Lexical Database for Semantic Relatedness Measurement

While many traditional studies on semantic relatedness utilize lexical databases such as WordNet or Wiktionary, recent word embedding learning approaches demonstrate their ability to capture syntactic and semantic information and outperform the lexicon-based methods. However, word senses are not disambiguated in the training phase of either Word2Vec or GloVe, two famous word embeddi...



Publication date: 2001
